{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# 11.2 Time Series Basics（时间序列基础）\n",
    "\n",
    "在pandas中，一个基本的时间序列对象，是一个用时间戳作为索引的Series，在pandas外部的话，通常是用python 字符串或datetime对象来表示的："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "import pandas as pd\n",
    "import numpy as np\n",
    "from datetime import datetime"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "dates = [datetime(2011, 1, 2), datetime(2011, 1, 5),\n",
    "         datetime(2011, 1, 7), datetime(2011, 1, 8), \n",
    "         datetime(2011, 1, 10), datetime(2011, 1, 12)]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "2011-01-02    0.384868\n",
       "2011-01-05    0.669181\n",
       "2011-01-07    2.553288\n",
       "2011-01-08   -1.808783\n",
       "2011-01-10    1.180570\n",
       "2011-01-12   -0.928942\n",
       "dtype: float64"
      ]
     },
     "execution_count": 6,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "ts = pd.Series(np.random.randn(6), index=dates)\n",
    "ts"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "上面的转化原理是，datetime对象被放进了DatetimeIndex:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "DatetimeIndex(['2011-01-02', '2011-01-05', '2011-01-07', '2011-01-08',\n",
       "               '2011-01-10', '2011-01-12'],\n",
       "              dtype='datetime64[ns]', freq=None)"
      ]
     },
     "execution_count": 7,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "ts.index"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "像其他的Series一行，数值原色会自动按时间序列索引进行对齐："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "2011-01-02    0.384868\n",
       "2011-01-07    2.553288\n",
       "2011-01-10    1.180570\n",
       "dtype: float64"
      ]
     },
     "execution_count": 9,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "ts[::2]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "2011-01-02    0.769735\n",
       "2011-01-05         NaN\n",
       "2011-01-07    5.106575\n",
       "2011-01-08         NaN\n",
       "2011-01-10    2.361140\n",
       "2011-01-12         NaN\n",
       "dtype: float64"
      ]
     },
     "execution_count": 10,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "ts + ts[::2]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "ts[::2]会在ts中，每隔两个元素选一个元素。\n",
    "\n",
    "pandas中的时间戳，是按numpy中的datetime64数据类型进行保存的，可以精确到纳秒的级别："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "dtype('<M8[ns]')"
      ]
     },
     "execution_count": 11,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "ts.index.dtype"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "DatetimeIndex的标量是pandas的Timestamp对象："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "Timestamp('2011-01-02 00:00:00')"
      ]
     },
     "execution_count": 12,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "stamp = ts.index[0]\n",
    "stamp"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Timestamp可以在任何地方用datetime对象进行替换。\n",
    "\n",
    "# 1 Indexing, Selection, Subsetting（索引，选择，取子集）\n",
    "\n",
    "当我们基于标签进行索引和选择时，时间序列就像是pandas.Series："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "2011-01-02    0.384868\n",
       "2011-01-05    0.669181\n",
       "2011-01-07    2.553288\n",
       "2011-01-08   -1.808783\n",
       "2011-01-10    1.180570\n",
       "2011-01-12   -0.928942\n",
       "dtype: float64"
      ]
     },
     "execution_count": 15,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "ts"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "stamp = ts.index[2]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "2.5532875030792592"
      ]
     },
     "execution_count": 14,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "ts[stamp]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "为了方便，我们可以直接传入一个字符串用来表示日期："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "1.1805698813038874"
      ]
     },
     "execution_count": 16,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "ts['1/10/2011']"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "1.1805698813038874"
      ]
     },
     "execution_count": 17,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "ts['20110110']"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "对于比较长的时间序列，我们可以直接传入一年或一年一个月，来进行数据选取："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 19,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "2000-01-01   -0.801668\n",
       "2000-01-02   -0.325797\n",
       "2000-01-03    0.047318\n",
       "2000-01-04    0.239576\n",
       "2000-01-05   -0.467691\n",
       "2000-01-06    1.394063\n",
       "2000-01-07    0.416262\n",
       "2000-01-08   -0.739839\n",
       "2000-01-09   -1.504631\n",
       "2000-01-10   -0.798753\n",
       "2000-01-11    0.758856\n",
       "2000-01-12    1.163517\n",
       "2000-01-13    1.233826\n",
       "2000-01-14    0.675056\n",
       "2000-01-15   -1.079219\n",
       "2000-01-16    0.212076\n",
       "2000-01-17   -0.242134\n",
       "2000-01-18   -0.318024\n",
       "2000-01-19    0.040686\n",
       "2000-01-20   -1.342025\n",
       "2000-01-21   -0.130905\n",
       "2000-01-22   -0.122308\n",
       "2000-01-23   -0.924727\n",
       "2000-01-24    0.071544\n",
       "2000-01-25    0.483302\n",
       "2000-01-26   -0.264231\n",
       "2000-01-27    0.815791\n",
       "2000-01-28    0.652885\n",
       "2000-01-29    0.203818\n",
       "2000-01-30    0.007890\n",
       "                ...   \n",
       "2002-08-28   -2.375283\n",
       "2002-08-29    0.843647\n",
       "2002-08-30    0.069483\n",
       "2002-08-31   -1.151590\n",
       "2002-09-01   -2.348154\n",
       "2002-09-02   -0.309723\n",
       "2002-09-03   -1.017466\n",
       "2002-09-04   -2.078659\n",
       "2002-09-05   -1.828568\n",
       "2002-09-06    0.546299\n",
       "2002-09-07    0.861304\n",
       "2002-09-08   -0.823128\n",
       "2002-09-09   -0.150047\n",
       "2002-09-10   -1.984674\n",
       "2002-09-11    0.468010\n",
       "2002-09-12   -0.066440\n",
       "2002-09-13   -1.629502\n",
       "2002-09-14    0.044870\n",
       "2002-09-15    0.007970\n",
       "2002-09-16    0.812104\n",
       "2002-09-17   -1.835575\n",
       "2002-09-18   -0.218055\n",
       "2002-09-19   -0.271351\n",
       "2002-09-20   -1.852212\n",
       "2002-09-21    0.546462\n",
       "2002-09-22    0.776960\n",
       "2002-09-23   -1.140997\n",
       "2002-09-24   -2.213685\n",
       "2002-09-25   -0.586588\n",
       "2002-09-26   -1.472430\n",
       "Freq: D, dtype: float64"
      ]
     },
     "execution_count": 19,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "longer_ts = pd.Series(np.random.randn(1000),\n",
    "                      index=pd.date_range('1/1/2000', periods=1000))\n",
    "longer_ts"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 20,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "2001-01-01    0.588405\n",
       "2001-01-02   -3.027909\n",
       "2001-01-03   -0.492280\n",
       "2001-01-04   -0.807809\n",
       "2001-01-05   -0.124139\n",
       "2001-01-06   -0.198966\n",
       "2001-01-07    2.015447\n",
       "2001-01-08    1.454119\n",
       "2001-01-09    0.157505\n",
       "2001-01-10    1.077689\n",
       "2001-01-11   -0.246538\n",
       "2001-01-12   -0.865122\n",
       "2001-01-13   -0.082186\n",
       "2001-01-14    1.928050\n",
       "2001-01-15    0.320741\n",
       "2001-01-16    0.473770\n",
       "2001-01-17    0.036649\n",
       "2001-01-18    1.405034\n",
       "2001-01-19    0.560502\n",
       "2001-01-20   -0.695138\n",
       "2001-01-21    3.318884\n",
       "2001-01-22    1.704966\n",
       "2001-01-23    0.145167\n",
       "2001-01-24    0.366667\n",
       "2001-01-25   -0.565675\n",
       "2001-01-26    0.940406\n",
       "2001-01-27   -1.468772\n",
       "2001-01-28    0.098759\n",
       "2001-01-29    0.267449\n",
       "2001-01-30   -0.221643\n",
       "                ...   \n",
       "2001-12-02    0.002522\n",
       "2001-12-03   -0.046712\n",
       "2001-12-04    1.825249\n",
       "2001-12-05   -1.000655\n",
       "2001-12-06   -0.807582\n",
       "2001-12-07    0.750439\n",
       "2001-12-08    1.531707\n",
       "2001-12-09   -0.195239\n",
       "2001-12-10   -0.087465\n",
       "2001-12-11   -0.041450\n",
       "2001-12-12    1.992200\n",
       "2001-12-13   -0.294916\n",
       "2001-12-14    1.215363\n",
       "2001-12-15    0.029039\n",
       "2001-12-16   -0.165380\n",
       "2001-12-17    1.192535\n",
       "2001-12-18    0.737760\n",
       "2001-12-19    0.044022\n",
       "2001-12-20    0.582560\n",
       "2001-12-21   -0.213569\n",
       "2001-12-22   -0.024512\n",
       "2001-12-23   -1.140873\n",
       "2001-12-24   -1.351333\n",
       "2001-12-25    0.725253\n",
       "2001-12-26   -0.943740\n",
       "2001-12-27   -2.134039\n",
       "2001-12-28   -0.548597\n",
       "2001-12-29    1.497907\n",
       "2001-12-30   -0.594708\n",
       "2001-12-31    0.068177\n",
       "Freq: D, dtype: float64"
      ]
     },
     "execution_count": 20,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "longer_ts['2001']"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "这里，字符串'2001'就直接被解析为一年，然后选中这个时期的数据。我们也可以指定月份："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 21,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "2001-05-01   -0.560227\n",
       "2001-05-02    2.160259\n",
       "2001-05-03   -0.826092\n",
       "2001-05-04   -0.183020\n",
       "2001-05-05   -0.294708\n",
       "2001-05-06   -1.210785\n",
       "2001-05-07    0.609260\n",
       "2001-05-08   -1.155377\n",
       "2001-05-09   -0.127132\n",
       "2001-05-10    0.576327\n",
       "2001-05-11   -0.955206\n",
       "2001-05-12   -2.002019\n",
       "2001-05-13   -0.969865\n",
       "2001-05-14    0.820993\n",
       "2001-05-15    0.557336\n",
       "2001-05-16   -0.262222\n",
       "2001-05-17   -0.086760\n",
       "2001-05-18    0.151608\n",
       "2001-05-19    1.097604\n",
       "2001-05-20    0.212148\n",
       "2001-05-21   -1.164944\n",
       "2001-05-22   -0.100020\n",
       "2001-05-23    0.734738\n",
       "2001-05-24    1.730438\n",
       "2001-05-25    1.352858\n",
       "2001-05-26    0.644984\n",
       "2001-05-27    0.997554\n",
       "2001-05-28    1.434452\n",
       "2001-05-29    0.395946\n",
       "2001-05-30   -0.142523\n",
       "2001-05-31    1.205485\n",
       "Freq: D, dtype: float64"
      ]
     },
     "execution_count": 21,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "longer_ts['2001-05']"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "利用datetime进行切片（slicing）也没问题："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 22,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "2.5532875030792592"
      ]
     },
     "execution_count": 22,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "ts[datetime(2011, 1, 7)]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "因为大部分时间序列是按年代时间顺序来排列的，我们可以用时间戳来进行切片，选中一段范围内的时间："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 23,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "2011-01-02    0.384868\n",
       "2011-01-05    0.669181\n",
       "2011-01-07    2.553288\n",
       "2011-01-08   -1.808783\n",
       "2011-01-10    1.180570\n",
       "2011-01-12   -0.928942\n",
       "dtype: float64"
      ]
     },
     "execution_count": 23,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "ts"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 24,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "2011-01-07    2.553288\n",
       "2011-01-08   -1.808783\n",
       "2011-01-10    1.180570\n",
       "dtype: float64"
      ]
     },
     "execution_count": 24,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "ts['1/6/2011':'1/11/2011']"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "记住，这种方式的切片得到的只是原来数据的一个视图，如果我们在切片的结果上进行更改的的，原来的数据也会变化。\n",
    "\n",
    "有一个相等的实例方法（instance method）也能切片，truncate，能在两个日期上，对Series进行切片："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 25,
   "metadata": {
    "collapsed": false,
    "scrolled": true
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "2011-01-02    0.384868\n",
       "2011-01-05    0.669181\n",
       "2011-01-07    2.553288\n",
       "2011-01-08   -1.808783\n",
       "dtype: float64"
      ]
     },
     "execution_count": 25,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "ts.truncate(after='1/9/2011')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "所有这些都适用于DataFrame，我们对行进行索引："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 26,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "dates = pd.date_range('1/1/2000', periods=100, freq='W-WED')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 27,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "long_df = pd.DataFrame(np.random.randn(100, 4),\n",
    "                       index=dates,\n",
    "                       columns=['Colorado', 'Texas',\n",
    "                                'New York', 'Ohio'])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 28,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>Colorado</th>\n",
       "      <th>Texas</th>\n",
       "      <th>New York</th>\n",
       "      <th>Ohio</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>2001-05-02</th>\n",
       "      <td>-0.477517</td>\n",
       "      <td>0.722685</td>\n",
       "      <td>0.337141</td>\n",
       "      <td>-0.345072</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2001-05-09</th>\n",
       "      <td>-0.401860</td>\n",
       "      <td>-0.475821</td>\n",
       "      <td>0.685129</td>\n",
       "      <td>-0.809288</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2001-05-16</th>\n",
       "      <td>1.900541</td>\n",
       "      <td>0.348590</td>\n",
       "      <td>-0.805042</td>\n",
       "      <td>-0.410077</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2001-05-23</th>\n",
       "      <td>-0.220870</td>\n",
       "      <td>1.654666</td>\n",
       "      <td>-0.846395</td>\n",
       "      <td>-0.207802</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2001-05-30</th>\n",
       "      <td>2.094319</td>\n",
       "      <td>-0.972588</td>\n",
       "      <td>1.276059</td>\n",
       "      <td>-1.056146</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "            Colorado     Texas  New York      Ohio\n",
       "2001-05-02 -0.477517  0.722685  0.337141 -0.345072\n",
       "2001-05-09 -0.401860 -0.475821  0.685129 -0.809288\n",
       "2001-05-16  1.900541  0.348590 -0.805042 -0.410077\n",
       "2001-05-23 -0.220870  1.654666 -0.846395 -0.207802\n",
       "2001-05-30  2.094319 -0.972588  1.276059 -1.056146"
      ]
     },
     "execution_count": 28,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "long_df.loc['5-2001']"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# 2 Time Series with Duplicate Indices（重复索引的时间序列）\n",
    "\n",
    "在某些数据中，可能会遇到多个数据在同一时间戳下的情况："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 29,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "dates = pd.DatetimeIndex(['1/1/2000', '1/2/2000', '1/2/2000', \n",
    "                          '1/2/2000', '1/3/2000'])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 31,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "2000-01-01    0\n",
       "2000-01-02    1\n",
       "2000-01-02    2\n",
       "2000-01-02    3\n",
       "2000-01-03    4\n",
       "dtype: int64"
      ]
     },
     "execution_count": 31,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "dup_ts = pd.Series(np.arange(5), index=dates)\n",
    "dup_ts"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "我们通过is_unique属性来查看index是否是唯一值："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 32,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "False"
      ]
     },
     "execution_count": 32,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "dup_ts.index.is_unique"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "对这个时间序列取索引的的话， 要么得到标量，要么得到切片，这取决于时间戳是否是重复的："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 33,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "4"
      ]
     },
     "execution_count": 33,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "dup_ts['1/3/2000'] # not duplicated"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 34,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "2000-01-02    1\n",
       "2000-01-02    2\n",
       "2000-01-02    3\n",
       "dtype: int64"
      ]
     },
     "execution_count": 34,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "dup_ts['1/2/2000'] # duplicated"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "假设我们想要聚合那些有重复时间戳的数据，一种方法是用groupby，设定level=0："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 35,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "2000-01-01    0\n",
       "2000-01-02    2\n",
       "2000-01-03    4\n",
       "dtype: int64"
      ]
     },
     "execution_count": 35,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "grouped = dup_ts.groupby(level=0)\n",
    "grouped.mean()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 36,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "2000-01-01    1\n",
       "2000-01-02    3\n",
       "2000-01-03    1\n",
       "dtype: int64"
      ]
     },
     "execution_count": 36,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "grouped.count()"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python [py35]",
   "language": "python",
   "name": "Python [py35]"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.5.2"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 0
}
